Image processing method and associated apparatus

ABSTRACT

An image processing method includes: receiving a plurality of images, the images being captured under different view points; and performing image alignment for the plurality of images by warping the plurality of images, where the plurality of images are warped according to a set of parameters, and the set of parameters are obtained by finding a solution constrained to predetermined ranges of physical camera parameters. In particular, the step of performing the image alignment further includes: automatically performing the image alignment to reproduce a three-dimensional (3D) visual effect, where the plurality of images is captured by utilizing a camera module, and the camera module is not calibrated with regard to the view points. For example, the 3D visual effect can be a multi-angle view (MAV) visual effect. In another example, the 3D visual effect can be a 3D panorama visual effect. An associated apparatus is also provided.

BACKGROUND

The present invention relates to three-dimensional (3D) visual effect reproduction, and more particularly, to an image processing method, and to an associated apparatus.

According to the related art, 3D visual effect reproduction typically requires preparation of source images and complicated calculations. During a preparation stage, no matter whether the resolution of the source images is high or low, some problems may occur. For example, the source images should be captured from a plurality of pre-calibrated cameras, where the pre-calibrated cameras should have been calibrated with respect to predetermined view points or predetermined lines of views, which makes the preparation of the source images difficult. In another example, in order to perform the complicated calculations efficiently, it is required to prepare a high-end computer having high calculation power, where the high-end computer would never be replaced by a conventional multifunctional mobile phone since it seems unlikely that the conventional multifunctional mobile phone can work well under the heavy calculation load of the complicated calculations. That is, the conventional multifunctional mobile phone can never be a total solution to 3D production/reproduction. In conclusion, the related art does not serve the end user well. Thus, a novel method is required for performing image processing regarding 3D visual effect reproduction in a smart and robust manner, in order to implement the preparation of the source images mentioned above and associated calculations within a portable electronic device such as a multifunctional mobile phone.

SUMMARY

It is therefore an objective of the claimed invention to provide an image processing method, and to provide an associated apparatus, in order to solve the above-mentioned problems.

It is another objective of the claimed invention to provide an image processing method, and to provide an associated apparatus, in order to implement the preparation of the source images mentioned above and associated calculations within a portable electronic device such as a multifunctional mobile phone.

It is another objective of the claimed invention to provide an image processing method, and to provide an associated apparatus, in order to carry out a total solution to three-dimensional (3D) production/reproduction by utilizing a portable electronic device (e.g. a mobile phone, a laptop computer, or a tablet).

An exemplary embodiment of an image processing method comprises: receiving image data of a plurality of images, the images being captured under different view points (or along different lines of views); and performing image alignment for the plurality of images by warping the plurality of images according to the image data, wherein the plurality of images are warped according to a set of parameters, and the set of parameters are obtained by finding a solution constrained to predetermined ranges of physical camera parameters.

An exemplary embodiment of an apparatus for performing image processing is provided, where the apparatus comprises at least one portion of an electronic device. The apparatus comprises: a storage and a processing circuit. The storage is arranged to temporarily store information. In addition, the processing circuit is arranged to control operations of the electronic device, to receive image data of a plurality of images, the images being captured under different view points (or along different lines of views), to temporarily store the image data into the storage, and to perform image alignment for the plurality of images by warping the plurality of images according to the image data, wherein the plurality of images are warped according to a set of parameters, and the set of parameters are obtained by finding a solution constrained to predetermined ranges of physical camera parameters.

These and other objectives of the present invention will no doubt become obvious to those of ordinary skill in the art after reading the following detailed description of the preferred embodiment that is illustrated in the various figures and drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a diagram of an apparatus for performing image processing according to a first embodiment of the present invention.

FIG. 2 illustrates the apparatus shown in FIG. 1 according to an embodiment of the present invention, where the apparatus of this embodiment is a mobile phone.

FIG. 3 illustrates the apparatus shown in FIG. 1 according to another embodiment of the present invention, where the apparatus of this embodiment is a personal computer such as a laptop computer.

FIG. 4 illustrates a flowchart of an image processing method according to an embodiment of the present invention.

FIG. 5 illustrates an input image and some transformation images generated during a learning/training procedure involved with the image processing method shown in FIG. 4 according to an embodiment of the present invention, where the learning/training procedure is utilized for determining a predefined solution space.

FIG. 6 illustrates some images obtained from multi-view vertical alignment involved with the image processing method shown in FIG. 4 according to an embodiment of the present invention.

FIG. 7 illustrates one of the images obtained from the multi-view vertical alignment and an associated image obtained from horizontal alignment involved with the image processing method shown in FIG. 4 according to an embodiment of the present invention.

FIG. 8 illustrates another of the images obtained from the multi-view vertical alignment and an associated image obtained from the horizontal alignment according to the embodiment shown in FIG. 7.

FIG. 9 illustrates another of the images obtained from the multi-view vertical alignment and an associated image obtained from the horizontal alignment according to the embodiment shown in FIG. 7.

FIG. 10 illustrates a disparity histogram involved with the image processing method shown in FIG. 4 according to an embodiment of the present invention.

FIG. 11 illustrates the apparatus shown in FIG. 1 according to an embodiment of the present invention, where the processing circuit thereof comprises some processing modules involved with the image processing method shown in FIG. 4, and can selectively operate with the aid of motion information generated by some motion sensors when needed.

FIG. 12 illustrates two images under processing of global/local coordinate transformation involved with the image processing method shown in FIG. 4 according to an embodiment of the present invention.

FIG. 13 illustrates the global/local coordinate transformation performed on the two images shown in FIG. 12.

FIG. 14 illustrates two images aligned to background motion(s) for three-dimensional (3D) display involved with the image processing method shown in FIG. 4 according to an embodiment of the present invention.

FIG. 15 illustrates two images aligned to foreground motion(s) for a multi-angle view (MAV) visual effect involved with the image processing method shown in FIG. 4 according to another embodiment of the present invention.

FIG. 16 illustrates a portrait mode of a plurality of display modes involved with the image processing method shown in FIG. 4 according to an embodiment of the present invention.

FIG. 17 illustrates a panorama mode of the plurality of display modes according to the embodiment shown in FIG. 16.

DETAILED DESCRIPTION

Certain terms are used throughout the following description and claims, which refer to particular components. As one skilled in the art will appreciate, electronic equipment manufacturers may refer to a component by different names. This document does not intend to distinguish between components that differ in name but not in function. In the following description and in the claims, the terms “include” and “comprise” are used in an open-ended fashion, and thus should be interpreted to mean “include, but not limited to . . . ”. Also, the term “couple” is intended to mean either an indirect or direct electrical connection. Accordingly, if one device is coupled to another device, that connection may be through a direct electrical connection, or through an indirect electrical connection via other devices and connections.

Please refer to FIG. 1, which illustrates a diagram of an apparatus 100 for performing image processing according to a first embodiment of the present invention. According to different embodiments, such as the first embodiment and some variations thereof, the apparatus 100 may comprise at least one portion (e.g. a portion or all) of an electronic device. For example, the apparatus 100 may comprise a portion of the electronic device mentioned above, and more particularly, can be a control circuit such as an integrated circuit (IC) within the electronic device. In another example, the apparatus 100 can be the whole of the electronic device mentioned above. In another example, the apparatus 100 can be an audio/video system comprising the electronic device mentioned above. Examples of the electronic device may include, but are not limited to, a mobile phone (e.g. a multifunctional mobile phone), a personal digital assistant (PDA), a portable electronic device such as the so-called tablet (based on a generalized definition), and a personal computer such as a tablet personal computer (which can also be referred to as the tablet, for simplicity), a laptop computer, or a desktop computer.

As shown in FIG. 1, the apparatus 100 comprises a processing circuit 110 and a storage 120. The storage 120 is arranged to temporarily store information, such as information carried by at least one input signal 108 that is inputted into the processing circuit 110. For example, the storage 120 can be a memory (e.g. a volatile memory such as a random access memory (RAM), or a non-volatile memory such as a Flash memory), or can be a hard disk drive (HDD). In addition, the processing circuit 110 is arranged to control operations of the electronic device, to receive image data of a plurality of images, the images being captured under different view points (or along different lines of views), to temporarily store the image data into the storage 120, and to perform image alignment for the plurality of images by warping the plurality of images according to the image data, where the plurality of images are warped according to a set of parameters, and the set of parameters are obtained by finding a solution constrained to predetermined ranges of physical camera parameters. For example, the images are captured, and more particularly, are arbitrarily captured under different view points by utilizing a camera module of the electronic device mentioned above, where the aforementioned image data can be received through the input signal 108 that is input into the processing circuit 110. This is for illustrative purposes only, and is not meant to be a limitation of the present invention. According to some variations of this embodiment, the images are captured, and more particularly, are arbitrarily captured under different view points by utilizing an external device such as a hand-held camera.

Please note that it is unnecessary for the camera module mentioned above to be calibrated. More particularly, the camera module of this embodiment is not calibrated with regard to the view points (or the lines of views) mentioned above. For example, in a situation where the electronic device is light enough for a user to hold it easily, the user may hold the electronic device to arbitrarily capture the images of some objects under these different view points. Then, the processing circuit 110 automatically performs the image alignment to reproduce a three-dimensional (3D) visual effect, and more particularly, generates 3D images to reproduce the 3D visual effect, where the 3D images may comprise emulated images that are not generated by utilizing any camera such as the camera module mentioned above. The processing circuit 110 may output information of the 3D images through at least one output signal 128 that carries the information of the 3D images. In practice, a screen of the electronic device can be utilized for displaying animation based upon the 3D images to reproduce the 3D visual effect. This is for illustrative purposes only, and is not meant to be a limitation of the present invention. According to some variations of this embodiment, the screen may provide the user with stereoscopic views based upon the 3D images to reproduce the 3D visual effect. No matter whether the screen is designed to provide stereoscopic views or not, examples of the 3D visual effect may comprise (but are not limited to) a multi-angle view (MAV) visual effect and a 3D panorama visual effect. According to some variations of this embodiment, the apparatus 100 can output the information of the 3D images, in order to reproduce the 3D visual effect by utilizing an external display device.

FIG. 2 illustrates the apparatus 100 shown in FIG. 1 according to an embodiment of the present invention, where the apparatus 100 of this embodiment is a mobile phone, and therefore, is labeled “Mobile phone” in FIG. 2. A camera module 130 (labeled “Camera” in FIG. 2, for brevity) is taken as an example of the camera module mentioned in the first embodiment, and is installed within the apparatus 100 mentioned above (i.e. the mobile phone in this embodiment), which means the apparatus 100 comprises the camera module 130. According to this embodiment, the camera module 130 is positioned around an upper side of the apparatus 100. This is for illustrative purposes only, and is not meant to be a limitation of the present invention. According to some variations of this embodiment, the camera module 130 can be positioned around another side of the apparatus 100. In addition, a touch screen 150 (labeled “Screen” in FIG. 2, for brevity) is taken as an example of the screen mentioned in the first embodiment, and is installed within the apparatus 100 mentioned above, which means the apparatus 100 comprises the touch screen 150. As shown in FIG. 2, the camera module 130 can be utilized for capturing the plurality of images mentioned above. For example, by analyzing the image data of the images, the processing circuit 110 can perform feature extraction and feature matching to determine/find out the aforementioned solution constrained to the predetermined ranges of physical camera parameters, such as some predetermined ranges of physical parameters of the camera module 130 (e.g. directions/angles of the lines of views of the camera module 130). As a result, the processing circuit 110 can generate the 3D images bounded to the aforementioned solution, in order to reproduce the 3D visual effect without introducing visible artifacts.

FIG. 3 illustrates the apparatus 100 shown in FIG. 1 according to another embodiment of the present invention, where the apparatus 100 of this embodiment is a personal computer such as a laptop computer, and therefore, is labeled “Laptop computer” in FIG. 3. The camera module 130 (labeled “Camera” in FIG. 3, for brevity) is taken as an example of the camera module mentioned in the first embodiment, and is installed within the apparatus 100 mentioned above (i.e. the laptop computer in this embodiment), which means the apparatus 100 comprises the camera module 130. According to this embodiment, the camera module 130 is positioned around an upper side of the apparatus 100. This is for illustrative purposes only, and is not meant to be a limitation of the present invention. According to some variations of this embodiment, the camera module 130 can be positioned around another side of the apparatus 100. In addition, a screen 50 (e.g. a liquid crystal display (LCD) panel) is taken as an example of the screen mentioned in the first embodiment, and is installed within the apparatus 100 mentioned above, which means the apparatus 100 comprises the screen 50. Similar descriptions are not repeated in detail for this embodiment.

FIG. 4 illustrates a flowchart of an image processing method 200 according to an embodiment of the present invention. The image processing method 200 shown in FIG. 4 can be applied to the apparatus 100 shown in FIG. 1, and more particularly, the apparatus 100 of any of the embodiments respectively shown in FIG. 2 and FIG. 3. The image processing method 200 is described as follows.

In Step 210, the processing circuit 110 receives image data of a plurality of images, the images being captured under different view points (e.g., the plurality of images disclosed in the first embodiment). In this embodiment, the aforementioned image data can be received through the input signal 108 that is input into the processing circuit 110. For example, the images are captured under these different view points (or along different lines of views), and more particularly, are arbitrarily captured by utilizing a camera module such as the camera module 130 disclosed above. Please note that it is unnecessary for the camera module mentioned above to be calibrated. More particularly, the camera module of this embodiment is not calibrated with regard to the view points.

In Step 220, the processing circuit 110 performs image alignment for the plurality of images by warping the plurality of images according to the image data, where the plurality of images are warped according to a set of parameters, and the set of parameters are obtained by finding a solution constrained to predetermined ranges of physical camera parameters. More particularly, the image alignment may include vertical alignment and horizontal alignment, where the horizontal alignment is typically performed after the vertical alignment is performed. For example, the horizontal alignment can be performed under disparity analysis, where the disparity analysis is utilized for analyzing warped images of the vertical alignment. This is for illustrative purposes only, and is not meant to be a limitation of the present invention. According to some variations of this embodiment, the preparation/beginning of the horizontal alignment can be performed after the preparation/beginning of the vertical alignment is performed, and the horizontal alignment and the vertical alignment can be completed at the same time when some warping operations are completed.

According to this embodiment, in order to achieve better performance during the operations disclosed above, the processing circuit 110 is preferably arranged to determine the aforementioned predetermined ranges of physical camera parameters in advance by performing operations of sub-steps (1), (2), (3), (4), and (5) as follows:

(1) the processing circuit 110 controls the camera module 130 to capture a base image, such as one of the plurality of images mentioned in Step 210;
(2) the processing circuit 110 controls the camera module 130 to capture multiple reference images, such as others within the plurality of images mentioned in Step 210;
(3) the processing circuit 110 records one set of physical camera parameters corresponding to each reference image, where the aforementioned one set of physical camera parameters can be some location/coordinate-related physical parameters of the camera module 130 disclosed above (e.g. directions/angles of the lines of views of the camera module 130), and may comprise some physical parameters that are not location/coordinate-related (e.g. focal lengths and some other lens parameters of the camera module 130);
(4) the processing circuit 110 warps the base image to match each reference image according to the recorded set of physical camera parameters, and therefore, generates a series of warped base images corresponding to each reference image; and
(5) the processing circuit 110 determines the aforementioned predetermined ranges of physical camera parameters by finding whether difference(s) between warped base images and the reference images are distinguishable under human vision, where the criterion (or criteria) for determining whether the difference(s) are distinguishable under human vision or not can be predefined based upon some predefined rules.

Thus, the processing circuit 110 eventually determines the aforementioned predetermined ranges of physical camera parameters, in order to achieve better performance during the operations disclosed in FIG. 4. As a result, for an arbitrary set of physical camera parameters that respectively fall within the aforementioned predetermined ranges of physical camera parameters (e.g. a set of physical camera parameters, each of which falls within the corresponding predetermined range of the aforementioned predetermined ranges), no difference between any warped base image corresponding to this set of physical camera parameters and the reference images is distinguishable under human vision.
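Sub-step (5) can be sketched for a single parameter as follows. This is only a minimal illustration under stated assumptions: the function name `find_parameter_range`, the callable `difference_fn` (standing in for "warp the base image with this parameter value and measure the difference against the reference image"), and the numeric `threshold` (standing in for the predefined visibility criterion) are hypothetical and are not defined in this description.

```python
def find_parameter_range(candidates, difference_fn, threshold, neutral=0.0):
    """Return (low, high) bounds of one physical camera parameter.

    candidates    -- candidate parameter values to test (hypothetical sampling)
    difference_fn -- hypothetical callable: parameter value -> difference score
                     between the warped base image and the reference image
    threshold     -- assumed stand-in for the "distinguishable under human
                     vision" criterion; scores at or below it count as invisible
    neutral       -- parameter value that corresponds to no warping
    """
    values = sorted(candidates)
    acceptable = [difference_fn(v) <= threshold for v in values]

    # Start from the candidate closest to the neutral (unwarped) value.
    start = min(range(len(values)), key=lambda i: abs(values[i] - neutral))
    if not acceptable[start]:
        return None

    # Grow the range outwards while the difference stays below the threshold.
    low = high = start
    while low > 0 and acceptable[low - 1]:
        low -= 1
    while high < len(values) - 1 and acceptable[high + 1]:
        high += 1
    return values[low], values[high]


# Toy usage: the difference score is assumed to grow with the rotation angle.
angles = [x * 0.5 for x in range(-8, 9)]          # -4.0 .. 4.0 degrees
print(find_parameter_range(angles, lambda a: abs(a), 2.0))  # (-2.0, 2.0)
```

Only the contiguous run of acceptable values around the neutral value is kept, so that every parameter value inside the returned range is one that was tested and found acceptable.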

Please note that, by performing the operations of sub-steps (1), (2), (3), (4), and (5) disclosed above, the solution constrained to the aforementioned predetermined ranges of physical camera parameters (i.e. the solution mentioned in the descriptions for Step 220) can be found, where the solution allows the base image to be arbitrarily warped while the associated physical camera parameters of this arbitrary warping operation keep falling within the aforementioned predetermined ranges of physical camera parameters. As the aforementioned predetermined ranges of physical camera parameters are preferably determined by finding whether any difference between warped base images and the reference images is distinguishable under human vision, the solution guarantees that this arbitrary warping operation will not cause any artifact that is distinguishable under human vision. Therefore, no visible artifact will be found.

In this embodiment, the sub-steps (1), (2), (3), (4), and (5) are taken as examples of the operations of determining the aforementioned predetermined ranges of physical camera parameters. This is for illustrative purposes only, and is not meant to be a limitation of the present invention. According to some variations of this embodiment, it is unnecessary to perform all of the sub-steps (1), (2), (3), (4), and (5). According to some variations of this embodiment, other sub-step(s) may be included.

FIG. 5 illustrates an input image 500 and some transformation images 512, 514, 522, and 532 generated during a learning/training procedure involved with the image processing method 200 shown in FIG. 4 according to an embodiment of the present invention, where the learning/training procedure is utilized for determining a predefined solution space (e.g. a pre-trained solution space).

As shown in FIG. 5, the processing circuit 110 performs similarity transformation on the input image 500 by performing a plurality of warping operations to generate the transformation images 512, 514, 522, and 532, in order to find out the solution mentioned in the descriptions for Step 220. Please note that some of these warping operations performed during the learning/training procedure may cause visible artifacts, which are allowed during the learning/training procedure. In this embodiment, the criterion (or criteria) for determining whether the difference mentioned in the sub-step (5) of the embodiment shown in FIG. 4 is distinguishable under human vision or not is predefined based upon some predefined rules, and therefore, the solution mentioned in the descriptions for Step 220 can be referred to as the predefined solution space. For example, the processing circuit 110 provides the user with an interface, allowing the user to determine whether a transformation image under consideration (e.g. one of the transformation images 512, 514, 522, and 532) has any artifact that is distinguishable under human vision. When the user determines that the transformation image under consideration does not have any artifact, the processing circuit 110 expands the predefined solution space (e.g., the predefined solution space is expanded to include the ranges of physical camera parameters corresponding to the transformation image under consideration); otherwise, the processing circuit 110 shrinks the predefined solution space (e.g., the predefined solution space is shrunk to exclude the ranges of physical camera parameters corresponding to the transformation image under consideration). Please note that the input image 500 can be the base image mentioned in the sub-step (1) of the embodiment shown in FIG. 4. This is for illustrative purposes only, and is not meant to be a limitation of the present invention. According to some variations of this embodiment, the input image 500 can be one of the reference images mentioned in the sub-step (2) of the embodiment shown in FIG. 4.
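The expand/shrink behavior driven by user feedback can be sketched for one parameter as follows. This is an illustrative assumption rather than the procedure of this embodiment: the helper names `apply_feedback` and `current_range`, the representation of the solution space as a set of accepted sample values, and the rule that a rejected value also discards accepted values farther from the neutral value are all hypothetical.

```python
def apply_feedback(accepted, value, has_artifact, neutral=0.0):
    """Update the set of accepted parameter values after one user judgment.

    accepted     -- set of parameter values already judged artifact-free
    value        -- parameter value of the transformation image under review
    has_artifact -- True when the user sees a distinguishable artifact
    """
    if not has_artifact:
        # Expand: the space now includes this parameter value.
        return accepted | {value}
    # Shrink: exclude this value and anything farther from the neutral value
    # on the same side (an assumed, conservative rule).
    if value >= neutral:
        return {a for a in accepted if a < value}
    return {a for a in accepted if a > value}


def current_range(accepted):
    """Return the (low, high) range spanned by the accepted values."""
    return (min(accepted), max(accepted)) if accepted else None


# Toy usage with rotation angles in degrees (values are made up).
space = {0.0}
for angle, artifact in [(1.0, False), (2.0, False), (3.0, True),
                        (-1.0, False), (2.0, True)]:
    space = apply_feedback(space, angle, artifact)
print(current_range(space))  # (-1.0, 1.0)
```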

As a result of performing the similarity transformation, the processing circuit 110 eventually determines the aforementioned predetermined ranges of physical camera parameters, where any warped image that is bounded within the predefined solution space will not have any artifact that is distinguishable under human vision. For example, in a situation where the transformation image 522 is a warped image that is bounded within the predefined solution space, the similarity transformation corresponding to the transformation image 522 can be considered to be a visually insensible 3D similarity transformation with regard to physical camera parameters, where “visually insensible” typically represents “deformation of the warped image is hard to distinguish by human vision”. Similar descriptions are not repeated in detail for this embodiment.

In the following embodiments such as those shown in FIGS. 6-10, with the aid of the learning/training results regarding the aforementioned solution such as the predefined solution space (e.g. the pre-trained solution space), the processing circuit 110 is capable of performing visually insensible image warping. As a result, the image alignment mentioned in the descriptions for Step 220 (e.g. the vertical alignment and the horizontal alignment) and the associated image warping (if any, for reproducing the 3D visual effect mentioned above) will not cause any artifact that is distinguishable under human vision.

FIG. 6 illustrates some images 612, 614, and 616 obtained from multi-view vertical alignment involved with the image processing method 200 shown in FIG. 4 according to an embodiment of the present invention. In this embodiment, the multi-view vertical alignment disclosed in FIG. 6 is taken as an example of the vertical alignment mentioned in the descriptions for Step 220.

As shown in FIG. 6, some video objects respectively shown in the images 612, 614, and 616 are the same object in the real world. For example, each of the images 612, 614, and 616 comprises a partial image of the same person and further comprises a partial image of the same logo (which is illustrated with a warped shape of “LOGO”). The processing circuit 110 performs feature extraction on each of the images 612, 614, and 616 and performs feature matching for the images 612, 614, and 616 to find out some common feature points in each of the images 612, 614, and 616, such as the feature points illustrated with small circles on the three dashed lines crossing the images 612, 614, and 616 within FIG. 6. For example, in each of the images 612, 614, and 616, one of the common feature points can be located at the upper right corner of the logo, another of the common feature points can be located at the lower left corner of the logo, and another of the common feature points can be located at a junction of something worn by the person.

During the multi-view vertical alignment, the processing circuit 110 aligns the images 612, 614, and 616 by performing rotating and/or shifting operations of their original images, which are multi-view images respectively corresponding to three view points (or three lines of views) and are a portion of the plurality of images mentioned in Step 210 in this embodiment. As a result of performing the multi-view vertical alignment, the common feature points in the images 612, 614, and 616 are aligned to the same vertical locations (or the same horizontal lines such as the three dashed lines shown in FIG. 6), respectively, where the dashed lines crossing the three images 612, 614, and 616 within FIG. 6 indicate the alignment results of the multi-view vertical alignment. Thus, the processing circuit 110 performs optimization over geometry constraint to solve the optimal camera parameters within a predefined solution space such as that mentioned above, and more particularly, a predefined visually insensible solution space. Similar descriptions are not repeated in detail for this embodiment.
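The idea of solving for rotating/shifting parameters inside a bounded solution space can be sketched as below. This is a simplified stand-in, not the optimization of this embodiment: it assumes matched feature points are already available as coordinate arrays, models only an in-plane rotation angle and a vertical shift, and uses a plain grid search over an assumed angle range. The function name `vertical_align` and its arguments are hypothetical.

```python
import numpy as np


def vertical_align(points, ref_points, angle_range_deg, steps=201):
    """Find a rotation angle and vertical shift that align feature rows.

    points          -- (N, 2) array of (x, y) feature points in one image
    ref_points      -- (N, 2) array of the matched points in the reference image
    angle_range_deg -- (low, high) allowed rotation angles; stands in for the
                       predetermined range of one physical camera parameter
    Returns (angle_deg, vertical_shift, mean_squared_error).
    """
    points = np.asarray(points, dtype=float)
    ref_points = np.asarray(ref_points, dtype=float)
    center = points.mean(axis=0)
    rel = points - center

    best = None
    # Only angles inside the allowed range are ever considered.
    for angle in np.linspace(angle_range_deg[0], angle_range_deg[1], steps):
        t = np.deg2rad(angle)
        # Vertical coordinate of each point after rotating about the centroid.
        y_rot = np.sin(t) * rel[:, 0] + np.cos(t) * rel[:, 1] + center[1]
        # The best vertical shift for this angle is the mean residual.
        shift = float(np.mean(ref_points[:, 1] - y_rot))
        err = float(np.mean((y_rot + shift - ref_points[:, 1]) ** 2))
        if best is None or err < best[2]:
            best = (float(angle), shift, err)
    return best
```

Restricting the search to `angle_range_deg` reflects the constraint described above: the result may fit the feature points less tightly than an unconstrained fit would, but the warp stays inside the range assumed to be visually insensible.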

FIG. 7 illustrates one of the images 612, 614, and 616 obtained from the multi-view vertical alignment, such as the image 612, and an associated image 622 obtained from horizontal alignment involved with the image processing method 200 shown in FIG. 4 according to an embodiment of the present invention. In addition, FIG. 8 illustrates another of the images 612, 614, and 616 obtained from the multi-view vertical alignment, such as the image 614, and an associated image 624 obtained from the horizontal alignment according to this embodiment.

Additionally, FIG. 9 illustrates another of the images 612, 614, and 616 obtained from the multi-view vertical alignment, such as the image 616, and an associated image 626 obtained from the horizontal alignment according to this embodiment.

The horizontal alignment disclosed in FIG. 7, FIG. 8, and FIG. 9 is taken as an example of the horizontal alignment mentioned in the descriptions for Step 220. In practice, the processing circuit 110 can perform disparity histogram analysis for two-dimensional (2D) translation of warped images. For example, the processing circuit 110 calculates the number of pixels with regard to displacement (more particularly, horizontal displacement) for each of the images 612, 614, and 616, in order to generate a disparity histogram for each of the images 612, 614, and 616, such as that shown in FIG. 10. As shown in FIG. 10, the horizontal axis represents the displacement (more particularly, the horizontal displacement) of at least one pixel (e.g., a single pixel or a group of pixels) within the image under consideration in comparison with a certain image, and the vertical axis represents the number of pixels, where the image under consideration can be any of the images 612, 614, and 616, and the aforementioned certain image can be the base image or a specific image selected from the images 612, 614, and 616. By performing the disparity histogram analysis, the processing circuit 110 can determine whether to or how to crop the image under consideration, in order to perform the horizontal alignment. For example, the processing circuit 110 performs the horizontal alignment on the image 612 by cropping a portion of the image 612 to obtain the image 622. In another example, the processing circuit 110 performs the horizontal alignment on the image 614 by cropping a portion of the image 614 to obtain the image 624. In another example, the processing circuit 110 performs the horizontal alignment on the image 616 by cropping a portion of the image 616 to obtain the image 626. Similar descriptions are not repeated in detail for this embodiment.
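A disparity histogram and a crop derived from it can be sketched as follows. The sketch assumes that integer horizontal displacements have already been measured per pixel or per matched feature, and it assumes that the most frequent displacement decides the crop; this description does not state which histogram statistic is used. The names `dominant_disparity` and `crop_overlap` are hypothetical.

```python
import numpy as np


def dominant_disparity(disparities):
    """Return the most frequent integer horizontal displacement.

    np.unique with return_counts=True acts as a histogram with one bin
    per distinct displacement value.
    """
    values, counts = np.unique(np.asarray(disparities, dtype=int),
                               return_counts=True)
    return int(values[np.argmax(counts)])


def crop_overlap(image, shift):
    """Drop the columns that have no counterpart in the reference image.

    A positive shift means the content sits `shift` pixels to the right of
    its position in the reference, so the leftmost columns are removed;
    a negative shift removes the rightmost columns instead.
    """
    width = image.shape[1]
    if shift >= 0:
        return image[:, shift:]
    return image[:, :width + shift]


# Toy usage with made-up displacements and a 2 x 10 test image.
shift = dominant_disparity([3, 3, 3, 2, 10, 3, -1])
image = np.arange(20).reshape(2, 10)
cropped = crop_overlap(image, shift)
print(shift, cropped.shape)  # 3 (2, 7)
```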

According to an embodiment of the present invention, such as a combination of the embodiments respectively shown in FIGS. 5-10, the processing circuit 110 performs the learning/training procedure, the multi-view vertical alignment, and the horizontal alignment as disclosed above, and further performs sequence reproduction, for reproducing the 3D visual effect mentioned above. For example, the processing circuit 110 performs the sequence reproduction by generating a series of warped images {613-1, 613-2, . . . } that vary from the image 612 to the image 614 and by generating a series of warped images {615-1, 615-2, . . . } that vary from the image 614 to the image 616 to output an image sequence {612, {613-1, 613-2, . . . }, 614, {615-1, 615-2, . . . }, 616}, in order to display animation based upon the images of the image sequence {612, {613-1, 613-2, . . . }, 614, {615-1, 615-2, . . . }, 616}, for reproducing the 3D visual effect. Similar descriptions are not repeated in detail for this embodiment.
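For illustrative purposes only, the ordering of the image sequence described above may be sketched as follows. The embodiment does not state how the intermediate warped images are generated; the sketch assumes that each image is described by a list of warp parameters and that the intermediate images are obtained by linear interpolation of those parameters. The function names `interpolate_params` and `build_sequence` are hypothetical.

```python
def interpolate_params(p_start, p_end, steps):
    """Return `steps` parameter sets strictly between p_start and p_end.

    Linear interpolation is an assumption made for this sketch.
    """
    out = []
    for k in range(1, steps + 1):
        t = k / (steps + 1)
        out.append([a + t * (b - a) for a, b in zip(p_start, p_end)])
    return out


def build_sequence(key_params, steps):
    """Interleave key parameter sets with interpolated ones.

    For three key images this yields the order
    {key 1, intermediates, key 2, intermediates, key 3}.
    """
    seq = [key_params[0]]
    for prev, nxt in zip(key_params, key_params[1:]):
        seq.extend(interpolate_params(prev, nxt, steps))
        seq.append(nxt)
    return seq
```

With three key parameter sets standing for the images 612, 614, and 616, the returned list follows the same nesting order as the image sequence {612, {613-1, 613-2, . . . }, 614, {615-1, 615-2, . . . }, 616}.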

FIG. 11 illustrates the apparatus 100 shown in FIG. 1 according to an embodiment of the present invention, where the processing circuit 110 thereof comprises some processing modules involved with the image processing method 200 shown in FIG. 4, and can selectively operate with aid of motion information generated by some motion sensors 130 when needed. Examples of the processing modules mentioned above may comprise a feature extraction module 1102 (labeled “Feature extraction”), a feature matching module 1104 (labeled “Feature matching”), a trained camera parameter prior module 1112 (labeled “Trained camera parameter prior”), a multi-view vertical alignment module 1114 (labeled “Multi-view vertical alignment”), a horizontal alignment module 1116 (labeled “Horizontal alignment”), a various 3D visual effect user interface (UI) module 1118 (labeled “UI for various 3D visual effect”), and an image warping module 1122 (labeled “Image warping”).

Based upon an input sequence that is input into the processing circuit 110 (more particularly, the image data of the input sequence, such as the image data of the plurality of images mentioned in Step 210), the feature extraction module 1102 and the feature matching module 1104 are arranged to perform the feature extraction and the feature matching disclosed above, respectively, while the trained camera parameter prior module 1112 is arranged to store results of the learning/training procedure regarding the aforementioned solution such as the predefined solution space, and more particularly, some trained camera parameters that are obtained during the learning/training procedure. In addition, the multi-view vertical alignment module 1114 and the horizontal alignment module 1116 are arranged to perform at least one portion (e.g. a portion or all) of the multi-view vertical alignment disclosed above and at least one portion (e.g. a portion or all) of the horizontal alignment disclosed above, respectively, where the image warping module 1122 is arranged to perform image warping (more particularly, the aforementioned visually insensible image warping) when needed. Additionally, the various 3D visual effect UI module 1118 is arranged to reproduce the 3D visual effect mentioned above, and more particularly, to perform various kinds of 3D visual effects when needed.

According to this embodiment, the motion sensors 130 can be optional since the processing circuit 110 can operate properly and correctly without the aid of the aforementioned motion information generated by the motion sensors 130, and therefore, the information paths from the motion sensors 130 to the processing circuit 110 are illustrated with dashed lines to indicate the fact that the motion sensors 130 can be optional. However, in a situation where the apparatus 100 is equipped with the motion sensors 130, the calculation load of the processing circuit 110 can be decreased since the aforementioned motion information may be helpful. Similar descriptions are not repeated in detail for this embodiment.

FIG. 12 illustrates two images IMG1 and IMG2 under processing of global/local coordinate transformation involved with the image processing method 200 shown in FIG. 4 according to an embodiment of the present invention, and FIG. 13 illustrates the global/local coordinate transformation performed on the two images IMG1 and IMG2 shown in FIG. 12.

As shown in FIG. 12, the processing circuit 110 can perform image processing on the images IMG1 and IMG2 by performing rotating, shifting, cropping, and/or warping operations on the images IMG1 and IMG2, respectively, where the warped rectangles respectively illustrated in the images IMG1 and IMG2 shown in FIG. 12 (i.e. those depicted with non-dashed lines) may represent the processed results of the images IMG1 and IMG2, respectively. As shown in FIG. 13, the processing circuit 110 may determine a clipping region for each of the processed results of the images IMG1 and IMG2 by virtually “overlapping” the images IMG1 and IMG2 and the processed results thereof on a set of global coordinates, which can be regarded as a common set of global coordinates for processing the images IMG1 and IMG2. For example, on the set of global coordinates, the start point for the image IMG1 can be located at the origin, and the start point for the image IMG2 can be located at a specific point on the horizontal axis, where the clipping region can be a maximum rectangular region available for both of the processed results of the images IMG1 and IMG2.
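As a minimal sketch under stated assumptions, the determination of a common clipping region on the set of global coordinates may be illustrated as follows. The sketch assumes that each processed result is a convex, nearly rectangular quadrilateral given by four corners, and it computes a conservative axis-aligned rectangle inside each quadrilateral rather than the true maximum rectangle; the function names `inscribed_rect` and `clipping_region` and the parameter `start2_x` are hypothetical.

```python
def inscribed_rect(quad):
    """Return a conservative axis-aligned rectangle inside a warped rectangle.

    quad: corners in the order (top-left, top-right, bottom-right,
    bottom-left), with y growing downward. The result is (left, top, right,
    bottom). It lies inside the quadrilateral but is not necessarily the
    largest such rectangle.
    """
    (tlx, tly), (trx, try_), (brx, bry), (blx, bly) = quad
    return (max(tlx, blx), max(tly, try_), min(trx, brx), min(bly, bry))


def clipping_region(quad1, quad2, start2_x):
    """Return a rectangle available to both processed results, or None.

    quad1 is placed with its start point at the origin of the global
    coordinates; quad2 is placed with its start point at (start2_x, 0) on the
    horizontal axis.
    """
    l1, t1, r1, b1 = inscribed_rect(quad1)
    l2, t2, r2, b2 = inscribed_rect([(x + start2_x, y) for x, y in quad2])
    left, top = max(l1, l2), max(t1, t2)
    right, bottom = min(r1, r2), min(b1, b2)
    if left >= right or top >= bottom:
        return None  # the two processed results do not overlap
    return (left, top, right, bottom)
```

The intersection of the two inner rectangles is guaranteed to lie within both processed results, which is the property the clipping region requires; a search for the true maximum rectangle would replace `inscribed_rect` in a fuller implementation.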

In practice, the start point for the image IMG1 and the start point for the image IMG2 can be determined based upon the proposed distance from the eyes of a viewer (e.g. the user) to the point of focus, such as the proposed distance (more particularly, the proposed horizontal distance) between the viewer and the convergence point where the sightlines of the respective eyes of the viewer converge. For example, in a situation where the 3D visual effect is supposed to be 3D display (or stereoscopic display), the processing circuit 110 can align the images IMG1 and IMG2 to background motion(s), where the embodiment shown in FIG. 14 is typical of this situation. As a result, the start point for the image IMG1 and the start point for the image IMG2 may be close to each other. In another example, in a situation where the 3D visual effect is supposed to be the MAV visual effect mentioned above, the processing circuit 110 can align the images IMG1 and IMG2 to foreground motion(s), where the embodiment shown in FIG. 15 is typical of this situation. As a result, the start point for the image IMG1 and the start point for the image IMG2 may be far from each other, in comparison with the embodiment shown in FIG. 14.

FIG. 16 illustrates a portrait mode of a plurality of display modes involved with the image processing method 200 shown in FIG. 4 according to an embodiment of the present invention, and FIG. 17 illustrates a panorama mode of the plurality of display modes according to this embodiment, where the processing circuit 110 is capable of switching between different display modes within the plurality of display modes, and more particularly, is capable of switching between the portrait mode and the panorama mode.

For example, referring to FIG. 16, some warped images bounded to the aforementioned predefined solution space (e.g. the pre-trained solution space) are aligned to foreground, such as the location where an actor/actress is supposed to be in front of the viewer. When it is detected that switching to the panorama mode is required (e.g. the viewer such as the user triggers the switching operation, or a predetermined timer triggers the switching operation), the processing circuit 110 rearranges these warped images in a reversed order and utilizes the rearranged warped images as the warped images for the panorama mode, where the leftmost warped image shown in FIG. 16 (i.e. the warped image IMG11 thereof) is arranged to be the rightmost warped image shown in FIG. 17 (i.e. the warped image IMG11 thereof), and the rightmost warped image shown in FIG. 16 (i.e. the warped image IMG17 thereof) is arranged to be the leftmost warped image shown in FIG. 17 (i.e. the warped image IMG17 thereof). This is for illustrative purposes only, and is not meant to be a limitation of the present invention. According to a variation of this embodiment, in a situation where the processing circuit 110 prepares the warped images for the panorama mode first, when it is detected that switching to the portrait mode is required (e.g. the viewer such as the user triggers the switching operation, or a predetermined timer triggers the switching operation), the processing circuit 110 rearranges the warped images for the panorama mode in a reversed order and utilizes the rearranged warped images as the warped images for the portrait mode. Similar descriptions are not repeated in detail for this variation.
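For illustrative purposes only, the rearrangement performed when switching between the portrait mode and the panorama mode may be sketched as follows. The function name `switch_display_mode` is hypothetical, and the sketch assumes that the warped images are held in an ordered list from leftmost to rightmost.

```python
def switch_display_mode(warped_images):
    """Return the warped images in reversed order for the other display mode.

    The input list is left unchanged, so the images prepared for the current
    mode remain available.
    """
    return list(reversed(warped_images))
```

Because the same reversal maps the portrait order to the panorama order and back, a single routine can serve both the embodiment and its variation.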

Based upon the embodiments/variations disclosed above, the image processing method 200 comprises performing automatic multiple image alignment in terms of the aforementioned predefined solution space (e.g. the pre-trained solution space) for reproducing the 3D visual effect from an image sequence, in which each image can be captured with an uncalibrated camera module (e.g. the camera module 130) or an uncalibrated hand-held camera. More particularly, the image processing method 200 comprises learning a model consisting of physical camera parameters, which can be utilized for performing visually insensible image warping. Examples of the parameters under consideration may comprise the extrinsic parameters for rotational variations and the intrinsic parameters for the camera calibration matrix and lens distortion. The image processing method 200 further comprises, from the corresponding feature points of the input sequence, performing the multi-view vertical alignment constrained by the learned camera parameters. The alignment process turns out to be a constrained optimization problem for image warping that is visually insensible to human vision. In addition, the image processing method 200 further comprises performing the horizontal alignment through the disparity analysis from the vertically-aligned matching points. Additionally, the image processing method 200 further comprises utilizing the UI such as a graphical UI (GUI) to reproduce the 3D visual effect by using the extracted alignment information and the warped image sequence.
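As a minimal sketch, and not as the implementation of the embodiments, the constrained optimization mentioned above may be illustrated with a reduced model. The sketch assumes only two parameters, an in-plane rotation angle `theta` and a vertical shift `ty`, as simplified stand-ins for the physical camera parameters, and it assumes that the learned ranges are supplied as `theta_range` and `ty_range`; all names are hypothetical, and a grid search is used here in place of whichever solver an actual implementation would employ.

```python
import math


def vertical_residuals(matches, theta, ty):
    """Vertical offsets left after rotating image-2 points and shifting by ty.

    matches: list of ((x1, y1), (x2, y2)) matched feature points, where the
    second point is rotated by theta about the origin before comparison.
    """
    s, c = math.sin(theta), math.cos(theta)
    return [(s * x2 + c * y2 + ty) - y1 for (_, y1), (x2, y2) in matches]


def align_vertically(matches, theta_range, ty_range, steps=200):
    """Minimize the squared vertical offsets with parameters kept in range.

    theta is searched on a uniform grid inside theta_range. For each theta,
    the best ty is the negated mean residual, clamped to ty_range. Returns
    (theta, ty, cost).
    """
    lo, hi = theta_range
    best = None
    for k in range(steps + 1):
        theta = lo + (hi - lo) * k / steps
        base = vertical_residuals(matches, theta, 0.0)
        ty = -sum(base) / len(base)
        ty = min(max(ty, ty_range[0]), ty_range[1])  # enforce the range
        cost = sum((r + ty) ** 2 for r in base)
        if best is None or cost < best[0]:
            best = (cost, theta, ty)
    return best[1], best[2], best[0]
```

Because every candidate lies inside the supplied ranges, the returned parameters cannot leave the range within which the warping was found to be visually insensible, even when the matched points contain outliers that would otherwise pull the solution away.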

It is an advantage of the present invention that the present invention method and apparatus can generate warped images bounded to the aforementioned solution such as the predefined solution space (e.g. the pre-trained solution space) to make the associated learned geometric distortion insensible to human vision, so that there is no artifact in each reproduced image. In addition, the optimization over the solution space according to the learned geometry constraint can always generate rational results. Regarding the implementation of the present invention method and apparatus, the working flow of the associated calculations can be highly parallelized, the associated computational complexity is low, and the required memory resource is economical. In contrast to the related art, the present invention method and apparatus are robust to image noise and outliers. Additionally, the present invention method and apparatus preserve the relative disparity/depth information in the warped image sequence, which is very important to image-based 3D applications.

Those skilled in the art will readily observe that numerous modifications and alterations of the device and method may be made while retaining the teachings of the invention. Accordingly, the above disclosure should be construed as limited only by the metes and bounds of the appended claims.

What is claimed is:
 1. An image processing method, comprising: receiving image data of a plurality of images, the images being captured under different view points; and performing image alignment for the plurality of images by warping the plurality of images according to the image data, wherein the plurality of images are warped according to a set of parameters, and the set of parameters are obtained by finding a solution constrained to predetermined ranges of physical camera parameters.
 2. The image processing method of claim 1, further comprising: determining the predetermined ranges of physical camera parameters by: (1) capturing a base image; (2) capturing multiple reference images; (3) recording one set of physical camera parameters corresponding to each reference image; (4) warping the base image to match each reference image according to the recorded set of physical camera parameters; and (5) determining the predetermined ranges of physical camera parameters by finding whether difference(s) between warped base images and the reference images is distinguishable under human vision.
 3. The image processing method of claim 1, wherein the image alignment includes vertical alignment and horizontal alignment.
 4. The image processing method of claim 3, wherein the horizontal alignment is performed after the vertical alignment is performed.
 5. The image processing method of claim 4, wherein the horizontal alignment is performed under disparity analysis, wherein the disparity analysis is utilized for analyzing warped images of the vertical alignment.
 6. The image processing method of claim 1, wherein the step of performing the image alignment further comprises: automatically performing the image alignment to reproduce a three-dimensional (3D) visual effect.
 7. The image processing method of claim 6, further comprising: generating 3D images to reproduce the 3D visual effect, where the 3D images comprise emulated images that are not generated by utilizing any camera.
 8. The image processing method of claim 6, wherein the 3D visual effect comprises a multi-angle view (MAV) visual effect.
 9. The image processing method of claim 6, wherein the 3D visual effect comprises a 3D panorama visual effect.
 10. The image processing method of claim 6, wherein the plurality of images is captured by utilizing a camera module; and the camera module is not calibrated with regard to the view points.
 11. An apparatus for performing image processing, the apparatus comprising at least one portion of an electronic device, the apparatus comprising: a storage arranged to temporarily store information; and a processing circuit arranged to control operations of the electronic device, to receive image data of a plurality of images, the images being captured under different view points, to temporarily store the image data into the storage, and to perform image alignment for the plurality of images by warping the plurality of images according to the image data, wherein the plurality of images are warped according to a set of parameters, and the set of parameters are obtained by finding a solution constrained to predetermined ranges of physical camera parameters.
 12. The apparatus of claim 11, wherein the processing circuit determines the predetermined ranges of physical camera parameters by: (1) capturing a base image; (2) capturing multiple reference images; (3) recording one set of physical camera parameters corresponding to each reference image; (4) warping the base image to match each reference image according to the recorded set of physical camera parameters; and (5) determining the predetermined ranges of physical camera parameters by finding whether difference(s) between warped base images and the reference images is distinguishable under human vision.
 13. The apparatus of claim 11, wherein the image alignment includes vertical alignment and horizontal alignment.
 14. The apparatus of claim 13, wherein the horizontal alignment is performed after the vertical alignment is performed.
 15. The apparatus of claim 14, wherein the horizontal alignment is performed under disparity analysis, wherein the disparity analysis is utilized for analyzing warped images of the vertical alignment.
 16. The apparatus of claim 11, wherein the processing circuit automatically performs the image alignment to reproduce a three-dimensional (3D) visual effect.
 17. The apparatus of claim 16, wherein the processing circuit generates 3D images to reproduce the 3D visual effect, wherein the 3D images comprise emulated images that are not generated by utilizing any camera.
 18. The apparatus of claim 16, wherein the 3D visual effect comprises a multi-angle view (MAV) visual effect.
 19. The apparatus of claim 16, wherein the 3D visual effect comprises a 3D panorama visual effect.
 20. The apparatus of claim 16, wherein the plurality of images is captured by utilizing a camera module; and the camera module is not calibrated with regard to the view points.